Recognizing Text Similarity

Authors

  • Ozlem Uzuner
  • Randall Davis
Abstract

Related Work in Text Similarity Recognition: Existing text similarity detection systems recognize verbatim similarities between documents but pay no attention to similarity in expression. SCAM [4, 5], developed in the Stanford Digital Library, looks for verbatim copies of text documents by fingerprinting the documents and checking these fingerprints against a repository of previously known fingerprints. To identify partial similarity, SCAM looks for overlaps between verbatim text strings.
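
The fingerprint-and-lookup idea described above can be illustrated with a minimal sketch. This is a generic illustration of verbatim-overlap detection, not SCAM's actual implementation: the chunking scheme (runs of `k` consecutive words), the hash function, and the names `fingerprints`, `overlap`, and `repository` are all assumptions made for the example.

```python
import hashlib


def fingerprints(text: str, k: int = 5) -> set[str]:
    """Hash every run of k consecutive words (assumed chunking scheme)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        chunks = [" ".join(words)]
    else:
        chunks = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(c.encode("utf-8")).hexdigest() for c in chunks}


def overlap(query: str, repository: dict[str, set[str]], k: int = 5) -> dict[str, float]:
    """Fraction of the query's fingerprints found in each stored document."""
    q = fingerprints(query, k)
    if not q:
        return {name: 0.0 for name in repository}
    return {name: len(q & fp) / len(q) for name, fp in repository.items()}


# Hypothetical repository holding the fingerprints of one known document.
repository = {
    "doc_a": fingerprints("the quick brown fox jumps over the lazy dog near the river"),
}

# A partial verbatim copy shares most of its word runs with doc_a.
partial_copy = overlap("a quick brown fox jumps over the lazy dog today", repository)

# A paraphrase with similar meaning shares no verbatim word runs.
paraphrase = overlap("a fast brown fox leaps across a sleepy dog", repository)
```

In this sketch the partial copy scores well above zero while the paraphrase scores zero, which mirrors the limitation the abstract points out: verbatim matching misses similarity in expression.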

Similar resources

SemKer: Syntactic/Semantic Kernels for Recognizing Textual Entailment

In this paper we describe the SemKer system participating in the fifth Recognizing Textual Entailment (RTE5) challenge. The major novelty with respect to the systems with which we participated in the previous challenges is the use of semantic knowledge based on Wikipedia. More specifically, we used it to enrich the similarity measure between pairs of text and hypothesis (i.e. the tree kernel...

Recognizing Textual Entailment: Is lexical similarity enough?

We describe the system we used at the PASCAL-2005 Recognizing Textual Entailment Challenge. Our method for recognizing entailment is based on calculating “directed” sentence similarity: checking the directed “semantic” word overlap between the text and the hypothesis. We use frequency-based term weighting in combination with two different lexical similarity measures. Although one version of the...

Text Based Similarity Metrics and Deltas for Semantic Web Graphs

Recognizing that two Semantic Web documents or graphs are similar and characterizing their differences is useful in many tasks, including retrieval, updating, version control and knowledge base editing. We describe several text-based similarity metrics that characterize the relation between Semantic Web graphs and evaluate these metrics for three specific cases of similarity: similarity in clas...

The Evaluation of Sentence Similarity Measures

The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications such as text mining, question answering, and text summarization. Given two sentences, an effective similarity measure should be able to determine whether the sentences are semantically equivalent or not, taking into account the variability of natural language ...

Identifying Semantic Divergences in Parallel Text without Annotations

Recognizing that even correct translations are not always semantically equivalent, we automatically detect meaning divergences in parallel sentence pairs with a deep neural model of bilingual semantic similarity which can be trained for any parallel corpus without any manual annotation. We show that our semantic model detects divergences more accurately than models based on surface features der...

Publication date: 2002